Andrzej Pękalski networks of scientific interests with internal degrees of freedom through self-citation analysis
Old and recent theoretical works by Andrzej Pękalski (APE) are recalled as possible sources of
interest for describing network formation and clustering in complex (scientific) communities, through
self-organisation and percolation processes. Emphasis is placed on the APE self-citation network over
four decades. The method is that used for detecting scientists' field mobility by focusing on an
author's self-citation, co-authorship and article-topic networks, as in [1, 2]. It is shown that APE's
self-citation patterns reveal important information on APE's interest in research topics over time,
as well as APE's engagement in different scientific topics and in different networks of collaboration.
Its interesting complexity results from "degrees of freedom" and external fields leading to so-called
internal shock resistance. It is found that APE's network of scientific interests belongs to
independent clusters and occurs through rare or drastic events, as in irreversible "preferential
attachment processes", similar to those found in the usual mechanics and thermodynamics of phase
transitions.

PACS numbers: 



I. INTRODUCTION 



Self-citation analysis is part of the wider bibliometric analysis of scientific and scholarly
citation patterns [l], 0]. In much of the recent literature on citation analysis, an author's
self-citations are excluded as 'noise' or treated as a bias in the analysis (e.g. [f| H, @]), and
hence dismissed or used to draw a negative conclusion about an author's activity. We disagree with
such a line of thought. As an example, consider Andrzej Pękalski's self-citations in his published
work over his career to date.

Andrzej Pękalski was born in Warsaw on Nov. 02, 1937. After the war he moved with his parents to
Wrocław. He graduated from the University of Wrocław in 1960, and got a Ph.D. in Theoretical Physics
from the Academy of Sciences Low Temperature and Structural Research Institute, in Wrocław, in 1970.
One of us (MA) met Andrzej Pękalski (APE) at a MECO conference in Bled (then in Yugoslavia), in 1976.
Both were interested in Random Spin Systems (RSS) and quickly published a paper on Physical
Properties of a Spin Model described by an Effective Hamiltonian with Two Kinds of Random Magnetic
Bonds 0,0, Fig. 1. APE introduced MA to his wife, Joanna, and to one of his friends, Jacek M.
Kowalski, among others. MA also introduced his wife into their scientific activities. Within a close
family-like cluster they published several papers resulting from the study of the so-called Magnetic
Lattice Gas (MLG) [10], i.e. inserting into the classical lattice gas model a new degree of freedom,
a spin, for the basic entity. This leads to the study of the Blume-Emery-Griffiths model, with
particular interactions, in view of obtaining some information on the approximations [Utll5l|.

FIG. 1: Marcel Ausloos and Andrzej Pękalski seen discussing Physical Properties of a Spin Model or
Oxygen Diffusion in YBCO, at a Karpacz winter school [9]

Moving away from the study of the static properties of a fluid seems a logical step, and APE took
it: he next started to work on two-dimensional Diffusion (2DD). E.g., with one of the authors (MA),
he studied extensively the tracer (surface) diffusion coefficient of oxygen in deficient CuO planes
of YBCO i^Gl. Most likely this led him to study molecular [lj| phase formation (and reactive)
trapping, and by extension, entity trapping, thus predator-prey models [20, 21] and population
evolutions or dynamics [12, [H, EJlHlj. Since such biophysics considerations are close to, or
entangled with, their macroscopic counterpart, the behavior of human populations [HI], APE went on
to study Bio-Socio-Econo (BSE)-physics problems, like the Model of Wealth and Goods Dynamics in a
Closed Market [26] and, recently, prison riots.

APE's scientific journey during the last four decades, both in terms of the number of publications
or collaborations and of moves to different research fields, can be followed more adequately by
focusing on how the network of citations to his own published papers develops over time. These
self-citations may carry important information on how the scientist sees his own work within his
investigations at the time of self-citation. We do not answer the question of whether self-citation
pays or distorts the reliability of scientific impact measures [28]. We will show that we disagree
with suggestions to remove self-citations [28] from citation counts. On the contrary, we confirm
that self-citations can be used for understanding the network structure of an author [29]. We have
already shown [1, 2] that there is some interest in examining self-citations in order to provide
some insight into the creativity and change of field of interest of a scientist. To do so, we use a
recently developed method to analyze self-citations in combination with co-authorships (and
keywords) in the self-citing articles as a potential tool for tracing scientists' field mobility p],
based on a de-clustering method rooted in percolation theory ideas [30, [H[. Moreover, this sort of
consideration ties the present research to the newly expanding research field of social communities
[32], in which questions pertain to the number of links, triangles, ... in the network and to the
modeling of complex (social or other) networks in order to represent statistical features,
evolution, ..., not to mention the questions of universality and robustness. Subsequently, our
research questions are:

1. Which structures do APE's self-citation networks entail?

2. Can these structures be interpreted in terms of research topics or fields?

3. Can self-citation networks be used as a means to uncover APE's field mobility, if any?

4. Are changes in APE's co-authorship associated with changes in the self-citation networks?

5. Do emerging scientific collaborations, in terms of possible changes in the co-authorships, also
indicate something about APE's focus of study?

The publication record of APE is of interest for doing so, in view of recent studies on other
authors, like Werner Ebeling [1, 2], for three reasons. First, both have founded or led a scientific
school in theoretical physics. Second, it is "known" (see above for APE) that both have been engaged
in networks of changing collaborators over time. Third, their publication records are markedly
different. In this paper we take self-citations as a source of information on the development of the
scientific career of APE, and we quantify and visualize his scientific journey through research
fields. We do not consider "work quality", the definition being unclear to the authors, nor the
number of citations by others, nor any impact factor.

The paper is organized as follows. In section II, we 
introduce the data set and the method used. In section 
III, we present our results obtained using the method. 
Thereafter, in section IV, we discuss the results. 

II. DATA AND METHOD 

In a recent CV, released in April 2007, APE claims to have 90 papers in (we quote him)
"international reviewed journals". In fact, his publication record can be downloaded in several
ways: first from his webpage [33], second from the Web of Science (the Science Citation Index, the
Social Sciences Citation Index and the Arts and Humanities Index), using the Boolean search
operation Pekalski A (not Pękalski) in the author field (General search interface). The APE web page
indicates 90 articles since his first in 1966 [34]. The Web of Science (accessed May 08, 2007)
surprisingly returns 95 such articles since 1969 [35]. This record, of course, only encompasses
articles that are indexed in the citation index databases, i.e. we excluded from this analysis: (i)
articles published in journals not indexed by ISI [36], (ii) papers not yet published at the time of
publication of the latter, (iii) books and book chapters. Self-citations are taken here as the
papers with APE as one of the authors which cite other papers with APE as one of the authors. We do
not normalize the number of self-citations, e.g. neither to the number of papers of the author or
coauthors, nor to the number of citations in one or the whole paper(s). Let us note that ISI has
also backwards indexed all authors, so one gets all papers of Pękalski, no matter whether he stands
as first, second or later author.
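The definition of a self-citation used here can be sketched in a few lines of Python. This is only a minimal illustration: the record layout (`authors`, `references`), the function names, the paper identifiers and the toy data are all hypothetical and are not taken from the Web of Science export used in this study.

```python
def self_citation_links(papers, author):
    """Return sorted (citing, cited) pairs in which both papers list `author`.

    `papers` maps a paper id to a dict with "authors" and "references"
    (a hypothetical layout, chosen for this sketch only).
    """
    own = {pid for pid, rec in papers.items() if author in rec["authors"]}
    links = set()
    for pid in own:
        for ref in papers[pid]["references"]:
            # Keep the link only if the cited paper is also by the author.
            if ref in own and ref != pid:
                links.add((pid, ref))
    return sorted(links)


def unlinked_papers(papers, author):
    """Papers by `author` that neither cite nor are cited by his other papers."""
    own = {pid for pid, rec in papers.items() if author in rec["authors"]}
    linked = {pid for link in self_citation_links(papers, author) for pid in link}
    return sorted(own - linked)


# Toy records, invented for illustration.
PAPERS = {
    "p1": {"authors": ["Pekalski A", "Ausloos M"], "references": []},
    "p2": {"authors": ["Pekalski A"], "references": ["p1", "x9"]},
    "p3": {"authors": ["Ausloos M"], "references": ["p1"]},
    "p4": {"authors": ["Pekalski A"], "references": ["x9"]},
}

print(self_citation_links(PAPERS, "Pekalski A"))  # [('p2', 'p1')]
print(unlinked_papers(PAPERS, "Pekalski A"))      # ['p4']
```

No normalization is applied in this sketch, in line with the choice stated above; papers such as `p4` stay outside the self-citation network.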

There are several possible sources of disagreement between the data "banks":

1. Papers are not in the Web of Science because they did not appear in ISI journals. In this case
the work can only be improved if the reference lists are available for these papers.

2. Papers from the list which are in ISI publications have not been found because of misspelling or
different spelling of the name of the author. Pękalski might sometimes appear as P-kalski, but
Pekalski is not such a troublesome case; we have only found a homonym, A. Pekalski in Delft, in
engineering. In fact, such possible false records usually get sorted out by not being linked to the
self-citation network of the main target. Since all co-authors are indexed, one will not likely miss
papers because "Pekalski" is not the first author.

In fact the authors are aware that the APE web page also misses some articles, like [9, 37], the
second one [37] surely being in the ISI citation index [3(| , the impact factor of the "journal" in
which it is published being e.g. 0.513 in 2004 HI].

Notice that we have neither searched for nor considered papers published in books or proceedings,
which could also be self-quoted. A posteriori, it seems that this slight neglect has not influenced
the conclusions to come. Therefore in the following we use the 95 Web of Science articles as the
empirical data basis.

In this article, we use a specific method that focuses on percolated islands of nodes. It is called
the Optimal Percolation Method (OPM) 0,0, and is a variant of the percolation idea-based methods
(PIBM) used in order to reveal structures in complex networks [13, [H[. One of the advantages of OPM
is the rapid identification of the resulting division of the whole network into sub-structures.
Recall that the OPM differs from the PIBM in at least two ways. On the one hand, OPM applies to
unweighted networks, while PIBM apply to correlation matrices (= weighted networks). On the other
hand, in OPM, one removes nodes instead of links.

In OPM, we define the connectivity c of the network as the number of pairs of nodes that can be
joined by a path (of arbitrary length). By definition, c is equal to N(N - 1)/2 for a connected
network, where N is the number of nodes. Then, we search for k nodes such that, when they are
removed from the network, the connectivity c is minimal. These k nodes define the intersection of
structures that we identify as different sub-groups. Finally, we plot the uncovered sub-groups in
different tones for the sake of clarity.
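The connectivity and the search for the k nodes can be illustrated as follows. This is a sketch under stated assumptions rather than the implementation used for the results: the function names and the toy graph are hypothetical, and the exhaustive search over all k-subsets is our assumption (the text does not specify the search strategy); its cost grows combinatorially, so it is only practical for small networks.

```python
from itertools import combinations


def connectivity(nodes, edges):
    """Number of unordered node pairs joined by a path (links are undirected)."""
    nodes = set(nodes)
    adj = {n: set() for n in nodes}
    for a, b in edges:
        if a in nodes and b in nodes:
            adj[a].add(b)
            adj[b].add(a)
    seen, total = set(), 0
    for start in nodes:
        if start in seen:
            continue
        # Depth-first search to measure the size of this component.
        stack, size = [start], 0
        seen.add(start)
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        total += size * (size - 1) // 2  # pairs inside one component
    return total


def optimal_removal(nodes, edges, k):
    """Exhaustive search (an assumption) for k nodes minimising connectivity."""
    nodes = sorted(nodes)
    best = None
    for removed in combinations(nodes, k):
        rest = [n for n in nodes if n not in removed]
        c = connectivity(rest, edges)
        if best is None or c < best[0]:
            best = (c, removed)
    return best


# Toy network: two triangles joined through node 3.
EDGES = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (5, 6), (4, 6)]

print(connectivity(range(7), EDGES))        # 21, i.e. 7 * 6 / 2
print(optimal_removal(range(7), EDGES, 1))  # (6, (3,))
```

Removing node 3 splits the toy network into the two triangles, which then play the role of the sub-groups.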

There is therefore an essential difference between OPM and PIBM. In PIBM, one removes links
depending on their value and then observes how the system breaks into clusters. In OPM, in contrast,
one looks for the nodes that, if removed, optimize the breaking of the system, and one removes them.
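For contrast, the link-removal idea behind PIBM can be sketched as below. The rule used here (keep a link only if its weight is at least a given threshold) and all names and weights are illustrative assumptions; the PIBM literature cited above should be consulted for the actual procedures.

```python
def clusters_after_link_filter(nodes, weighted_links, threshold):
    """Keep links with weight >= threshold; return the clusters as sorted lists."""
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, w in weighted_links:
        if w >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Toy weighted network (invented values).
NODES = ["a", "b", "c", "d"]
LINKS = [("a", "b", 0.9), ("b", "c", 0.2), ("c", "d", 0.8)]

print(clusters_after_link_filter(NODES, LINKS, 0.1))  # [['a', 'b', 'c', 'd']]
print(clusters_after_link_filter(NODES, LINKS, 0.5))  # [['a', 'b'], ['c', 'd']]
```

Raising the threshold breaks the network along its weakest links, whereas OPM keeps every link and removes nodes.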

We concentrate below on the appearance of 'clusters', i.e. groups of articles which are more
intensively linked to each other than to the rest. We interpret this clustering as a sign that the
articles in a sub-group have something in common.

The analysis proceeds in five steps. First, we look at the frequency of the ISI-indexed articles
authored or co-authored by APE over time. Second, the overall structure of his self-citation network
is examined. Third, we apply the OPM to this network. Next, we analyze the co-authorships in these
articles. Finally, we show the overall development of article types as a function of the year of
publication and the detected topics.



III. RESULTS 

FIG. 2: Number of papers by Andrzej Pękalski published in a given year; black line: the 95
ISI-indexed articles; dashed line: the 90 articles listed on his web page

FIG. 3: Cumulative distribution of the 95 ISI-indexed articles written by Andrzej Pękalski as a
function of time; dashed line: same for the 90 articles mentioned on his web page



A. Basic Statistics and Clusters 

We start with the empirical time evolution of the number of articles published by APE (and his
co-authors) as a function of the year of publication (Fig. 2). We see that APE has been continuously
productive during the last 40 years. His yearly productivity fluctuates between 0 and 9 articles,
with an average of around 3 papers/year. There are occasional gaps in his production, but his
productivity takes off in 1980 and accelerates at the end of the last century (Fig. 3). Observe the
plateau ending in 1994 or so.
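The yearly counts and the cumulative curve are simple tallies over publication years. A minimal sketch follows; the list of years is invented for illustration and does not reproduce APE's record, and the function name is hypothetical.

```python
from collections import Counter
from itertools import accumulate


def yearly_and_cumulative(years):
    """Return (span of years, papers per year, running total), gaps counted as 0."""
    counts = Counter(years)
    span = list(range(min(years), max(years) + 1))
    per_year = [counts.get(y, 0) for y in span]
    return span, per_year, list(accumulate(per_year))


# Invented publication years, one entry per paper.
YEARS = [1969, 1969, 1971, 1972, 1972, 1972]

span, per_year, cumulative = yearly_and_cumulative(YEARS)
print(span)        # [1969, 1970, 1971, 1972]
print(per_year)    # [2, 0, 1, 3]
print(cumulative)  # [2, 2, 3, 6]
print(sum(per_year) / len(span))  # 1.5 papers/year on this toy list
```

A plateau in the cumulative list corresponds to a run of zeros in the yearly list.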

The total number of APE co-authors is equal to 11; the number (20) of single-authored papers stays
remarkably persistent over time, in this list. The network of his co-authors varies and of course
grows over time. Hence, to further reveal the structure of APE's scientific journey, it is of
interest to look at the position of the articles in his self-citation network.



As a next step in our analysis, we apply the Optimal Percolation Method algorithm to the percolated
island [45]. We deduce 3 sub-groups, as shown in Fig. 4. The structures, plotted in grey, black, and
white, are composed of 9, 7 and 32 nodes, respectively, and of 18, 19 and 68 links (citations from
this list), respectively. Note that the clusters represented in Fig. 4 should not be read as
evolutionary trees. Different sub-groups contain articles from different points in time (years).
Later on, we will also make the temporal structure of the self-citation network visible.

For completeness, let us observe which papers are most often quoted, i.e. which seem the most
relevant ones for the authors: those which get the most links, and how many:

1. 6 times: A. Pękalski and M. Ausloos: Physical Properties of a Spin Model described by an
Effective Hamiltonian with Two Kinds of Random Magnetic Bonds [8]

2. 5 times: M. Ausloos, P. Clippe, J.M. Kowalski, and A. Pękalski: Magnetic Lattice Gas [10]

3. 17 times: I. Mroz, A. Pękalski, and K. Sznajd-Weron: Conditions for adaptation of an evolving
population [23]



In order to characterize the trends of activity or research fields in the career of APE, as well as
to verify the pertinence of the automatic classification, we can look at the keywords associated
with the articles in each cluster, when available. There are only a few overlaps between the three
clusters; this confirms the relevance of the three revealed sub-groups. Therefore clusters in the
self-citation network(s) can be used to demarcate different research fields, and hence to reveal the
creativity and adaptation of the author.

The interpretation of the three different structures is not difficult if one is acquainted with
APE's work. The first (grey plotted) area is related to articles written about random/disordered
spin models. Work in this area belongs to classical streams in statistical mechanics theory. The
second area (black plotted) strictly contains work on the MLG physics. This belongs to models in
which an extra degree of freedom allows coupling with new external fields and widely extends the
phase diagram [3^, ■. Interestingly, one could imagine a strong connection with the RSS model works;
this is not so for Pękalski. The third and largest area (white plotted) represents a huge branch of
investigations in APE's work, namely diffusion, adaptation and self-organisation research, which is
entangled with problems of competition between entities, like the prey-predator problems. These are
much less specific and less classical topics within statistical physics, though they have existed
since the early studies of populations and their dynamics, going back to Malthus and Verhulst. It is
remarkable that for Pękalski these are connected questions.

Another possible difference between the identified subfields lies in the respective co-authorship
networks in the clusters. We have identified the most frequent co-authors in each structure (Table
I). The empirical results show that the respective lists of co-authors are 'loosely similar', namely
there is no strong dependence between the research field or topic of a paper and the co-author with
whom the paper was written, as deduced from this self-citation list. Nevertheless, note that only a
few APE co-authors occur in three clusters, and very few in two clusters.

The list of collaborators makes the structure of the APE degrees of freedom visible. Many of these
co-authors are either local students or senior colleagues; two co-authors are from the Liège group.
This shows the intense connections of APE and his restricted degrees of freedom for this sort of
work and its/his evolution.

TABLE I: Most frequent co-authors appearing in the cluster structures. The column No shows the
number of co-authored articles by the given co-author, and in parentheses the total number of
different co-authors in the specific cluster. We only indicate those appearing at least twice in the
given cluster



B. Time dependence 

So far, we have shown the correlation between self-citation clusters and co-authorships in the
self-citing articles of APE. In this subsection, we analyse the temporal structure, the scientific
journey, in APE's research activities over time as perceived through his self-citations. As shown
above, the percolation method has led to a decomposition of the self-citation network into three
disjoint structures, which we represent in black, white and grey for visualization. In order to
evaluate the time evolution of the author's career, we draw (Fig. 5) a series of boxes, each
representing one article, from the first to the last published paper in this list of self-citing
papers. This gives a rapid visualization of the periods of activity of the author in each subfield.
We see that APE's activities in different research fields are strongly concentrated in different
periods of time. There is barely any overlap, reminding us of phase transitions; see [41] for a
related discussion on works by many authors.

During the 1970-80s, for example, his research was clearly directed toward the RSS, i.e. the
classical road toward joining the club of fundamentalists in discrete models in statistical
mechanics. Between 1985 and 1995 or so, APE worked with quite a limited number of co-authors on the
MLG papers. In this period APE contributed to tying mathematics to physics. The spreading of the
ideas around self-organization, irreversible processes and non-linear dynamics in physics, due to
exo- and/or endogenous effects on whatever population of entities, came after the 1994 plateau (Fig.
3). In addition to the articles analysed here, notice that APE organised several winter schools and
edited several books, like [43]. These new topics, related to questions of biophysics, economy and
social dynamics, are the seat of intense activity nowadays; the data indicate that APE can stick to
a competitive field and participate in its evolution.

It is interesting to note that the transition from one subfield to the other is rather sharp and
irreversible, i.e. the author does not return to a subfield after a period of inactivity, and seems
to remain active in a subfield over long time periods. He resists external shocks, or maintains a
"high internal resistance" to "external fields".

It is also worth noting (Fig. 3) that productivity, if measured in terms of publications per year,
increases over time: while it took APE ten years to publish the first ten ISI-indexed self-citing
articles, it only took six years to publish eighteen ISI-indexed articles on BSE, or about 12 years
for the last 24 papers. This result is consistent with a recent study showing that scientists'
productivity increases over the course of their career [43], up to the declining years!

IV. DISCUSSION 

In much of the recent literature on citation analysis [3], an author's self-citations are excluded
as 'noise' or treated as a bias in the analysis (e.g. [f| H, Q). We use a recently developed method
to analyze self-citations in combination with co-authorships in the self-citing articles of a
well-known author (APE) as a potential tool for tracing scientists' field mobility [l|, 0, [3(| [M|
and, if possible, its causes.

In the case of APE, a great number (48), yet only a small majority, of the 95 articles in this list
are linked to each other by self-citation; three clusters emerge with, in chronological order, 9, 7
and 32 nodes or articles, for a total of 48 articles; thus 47 are not connected. The clusters are
well separated. Because we are interested in the network of self-citations, the disconnected
articles have been excluded from the study, and hence from the network structure. Compared with
Ebeling, APE has more articles outside "his percolated island". It is known to one of the authors
(MA) that APE is not prone to having a long list of references in his papers, whatever that may
mean.

The five steps in our analysis build upon and support each other, and hence emphasize consistent
patterns in the development of the career of APE. It is shown that APE's self-citation patterns
reveal important information on APE's interest in research topics over time as well as APE's
engagement in different networks of collaboration. It is found that APE's network of scientific
interests belongs to independent clusters and occurs through rare or drastic events which result
from "preferential attachment processes" to some coauthor group, as in the usual mechanics and
thermodynamics formalism of phase transitions. In some sense it can be conjectured that this
interesting complexity results from "degrees of freedom" coupled to external fields leading to
internal motivation, though subject, most likely, to shock resistance.



Indeed there is a strong connection (seen in the quite finite size of the clusters and of the list
of co-authors) between the co-authorships and the topics of the self-citing articles of the author.
Altogether, these results seem to justify the use of self-citation networks as a key signature of a
scientist's career. In the case of Ebeling [1, 2], it was found that the OPM analysis suggests that
changing co-authorships drive the changing research interests and the move to new research topics.
The same is true here, but in the opposite case, i.e. when there is not much change in the
coauthors/groups. The Optimal Percolation Method can therefore well serve as a relevant tool for
detecting the development of trends in a scientific community or in the scientific career of an
author, highly productive or not.

It seems that in the case of Pękalski, the OPM has some statistical limitations, since almost half
of his papers lack self-citations and hence are excluded from the analysis! It might be interesting
to take a look at the topics of these unconnected nodes in the network. It seems that the OPM works
best in cases where the author heavily cites his/her other articles, or rather when the value of k
to be taken into account for breaking apart the main cluster is finite.

Moreover, if the aim is to trace scientists' careers, using a single address in the search might
exclude several articles from the ISI data set. To our knowledge this is not the case for Pękalski,
for whom no address was inserted in the data extraction, nor in fact for Ebeling, for whom one
address was used.

For biographical research concentrated on single authors, the method reveals interesting additional
information. To interpret the motivations for the occurrence of certain research fields one has to
look into the biography of an author, or have personal knowledge about the author. External changes
such as political conditions or geographical moves, but also visits to conferences, invited guest
positions and longer stays abroad, should, or might, trigger new collaborations and new research
topics which, in turn, should become visible in the patterns of self-citations. That seems to be
somewhat the case here, but research on other authors would be worthwhile to confirm such effects.

FIG. 4: The Optimal Percolation Method applied to Andrzej Pękalski's self-citation network. The
three revealed structures are plotted in grey (top left), black (lower left corner) and white (right
hand side), corresponding to Random Spin Systems (RSS), Magnetic Lattice Gas (MLG), and
Bio-Socio-Econo (BSE)-physics, respectively




FIG. 5: Time evolution of the article type as a function of the 
year/article number 



